The United States of America is a relatively new country. It has existed for only a little over 200 years. In that time it has passed through several very different phases on its way to becoming a major global power: early expansionism, civil war and industrialization, the world wars, the Cold War, and modern America.
The question we ask in this study is: how is the USA’s historical evolution reflected in its leaders’ speeches? To answer it, we analyze the State of the Union speeches, which have been delivered yearly since 1790. Our objective is to gain insights into the general sentiment, as well as the relevance of and sentiment towards specific entities related to each period. Our main assumption is that a leader’s speech will reflect the nation’s overall sentiment.
To do this, we divided our analysis into six periods that we consider characteristic of the history of the United States. The periods are the following:
(1789-1861) A New Nation: This period was characterized by territorial expansion and massive immigration from Europe and Asia. Most of the present-day states were incorporated in this period. In general, it was a prosperous era, ended by the worst conflict the country has yet seen: the Civil War.
(1861 - 1913) Civil War and Industrialization: This period starts with the Civil War (1861-1865) and continues through the late 19th and early 20th centuries. Great technological changes occurred in this period. Electricity came into widespread use, and transportation was revolutionized.
(1914 - 1945) World Wars: Enormous changes occurred worldwide during this period. Two major wars and the greatest economic depression shaped the modern world as we know it. The USA, together with the Soviet Union, reaffirmed its position as a major world power.
(1945 - 1989) Cold War: This was a period of great tension between countries. Some conflicts arose, such as the Korean War and the Vietnam War. Economically speaking, there was tremendous growth and innovation. The Internet and space exploration both began in this period. The era ends with the fall of the Berlin Wall and the subsequent fall of the Soviet Union.
(1990 - 2001) Nineties: This was, in general, a period of economic prosperity. While the “communist threat” had vanished, a new “enemy” came onto the scene: terrorists.
(2002-2016) Modern America: The most recent years have seen tremendous technological innovation. However, a major economic crisis and global terrorism have been constantly on everybody’s lips.
We divide our analysis into four parts:
General sentiment analysis: We explore the sentiment of presidential speeches over time, and relate some of the observations to historical events.
Targeted analysis: We break down the sentiment analysis by entity, to observe which entities are associated with more positive, negative, or fluctuating sentiment over time.
Word analysis by party: We analyze how presidential speeches reflect fundamental differences between parties, and explore how these differences have evolved over time.
Entity network analysis: We give a general overview of the relationships between entities that are relevant to the USA. We group these entities into several subgroups according to their network.
We want to analyze the sentiment (that is, the attitude, opinion, or feeling) of the speeches during those six periods using the Alchemy API. For example, if a speech contains a pessimistic view of the economy, or expresses great concern about a war, it is classified as negative. In contrast, if the speech shows an optimistic attitude towards the future economic situation, or reflects strong optimism about winning a war, it is classified as positive.
In the Speech Sentiment graph above, we color each speech according to the incumbent president’s political party. Following tradition, we plotted Democrats in blue, Republicans in red, and presidents of neither party in black. The y axis represents the intensity of the sentiment, where zero represents neutral sentiment. It is important to note that we categorized presidents belonging to the historical “Whig” and “Democratic-Republican” parties as Republicans, while the “Federalists” were categorized as Democrats. This categorization was made according to the history of those parties.
Overall, more than 90% of the speeches are classified as positive. This is not difficult to explain: presidents can be expected to avoid delivering negative speeches in public very often. Even during the hardest periods, such as the Great Depression and World War II, the president has to encourage citizens and boost morale.
Although very few speeches are classified as negative, each of them can be related to a major historical event. For example, the several negative speeches at the end of the A New Nation period coincide with the approach of the Civil War. Before the war broke out, the North and the South had irreconcilable conflicts over crucial issues such as slavery, and the prospect of war hung over the entire country. The negative speeches in the World Wars period are related to the Great Depression and the outbreak of World War II. Lastly, the negative speeches from 2002 to the present are connected to 9/11, the Iraq War, and above all the 2008-2009 economic crisis.
Next, we focus on analyzing specific entities mentioned in the speeches. We use four different criteria to select which words to visualize:
We find some interesting results. For example, government has been regarded as one of the most negative entities. Another observation is that China has become the most positive entity in recent years. Also, the sentiment towards the British government has changed over time, following the fluctuations in the relationship between the two countries. Similarly, many other terms can be compared to see how their sentiment has changed over time.
Let’s switch to another perspective. We plotted a histogram of the relevance of the 20 most frequently used entities. The relevance score measures how frequently a topic is mentioned by a specific president.
From the above we can observe that Congress is mentioned by all presidents, as is common practice in the address. The Navy has barely been mentioned since the Cold War. This may be because naval development was not a top priority while the U.S. government was implementing the Star Wars program.
In the next section, we compare Republicans and Democrats over time. For each period, we compared the most common words used by Republican and Democratic presidents.
We calculated a “similarity score” between Republican and Democratic speeches for each period. This score was calculated by computing the correlation between the vectors of relative word frequencies for each party, and it therefore ranges from -1 to 1.
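As a minimal sketch of such a score, assuming the correlation is a Pearson correlation taken over the union of both parties’ vocabularies (the text does not specify either detail), the calculation could look like the following. The function names and the toy word lists are hypothetical and are not taken from the original analysis.

```python
from collections import Counter
from math import sqrt


def relative_frequencies(words, vocabulary):
    """Return the relative frequency of each vocabulary word in `words`."""
    counts = Counter(words)
    total = len(words)
    return [counts[w] / total for w in vocabulary]


def pearson(x, y):
    """Pearson correlation between two equal-length numeric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Undefined (division by zero) if either vector is constant.
    return cov / sqrt(var_x * var_y)


def similarity_score(words_a, words_b):
    """Correlate two parties' relative word frequencies (assumed method)."""
    vocabulary = sorted(set(words_a) | set(words_b))
    freq_a = relative_frequencies(words_a, vocabulary)
    freq_b = relative_frequencies(words_b, vocabulary)
    return pearson(freq_a, freq_b)


# Toy, made-up word lists standing in for one period's speeches.
republican_words = "war terrorism fear people nation war".split()
democrat_words = "jobs welfare health people nation jobs".split()
toy_score = similarity_score(republican_words, democrat_words)
print(round(toy_score, 3))
```

Under these assumptions, two parties that use the same words in the same proportions score 1, and parties whose frequent words barely overlap score below zero.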
We first plotted the similarity score to show how Republicans and Democrats have been diverging in their speeches. Early presidents typically used general, common terms such as people, citizens, and constitution, whereas modern presidents have diverged. Modern Republicans tend to speak more about terrorism, fear, and war, while modern Democrats tend to focus on terms like welfare, jobs, and health care.
It is worth noting the low similarity score observed during the World Wars period. The reason for this difference is that Democratic presidents had to deal with both World Wars, while Republican presidents governed during the relatively prosperous 1920s. Therefore, their speeches were radically different.
With the hope of further understanding the differences between parties in different time periods, we decided to show comparisons from the first and the last periods: A New Nation and Modern America.
Below, we show three analyses for each of the two periods.
| Top words Republican | Score | Top words Democrat | Score |
|---|---|---|---|
| great | 0.0016912 | upon | -0.0023555 |
| millions | 0.0013367 | people | -0.0022939 |
| nation | 0.0013179 | mexico | -0.0020358 |
| last | 0.0012010 | constitution | -0.0019729 |
| british | 0.0010091 | public | -0.0014277 |
| spain | 0.0008060 | president | -0.0012999 |
| commerce | 0.0008033 | bank | -0.0012531 |
| nations | 0.0008010 | general | -0.0011208 |
| view | 0.0007961 | power | -0.0010619 |
| improvement | 0.0007842 | money | -0.0010279 |
| consideration | 0.0007269 | banks | -0.0010138 |
| course | 0.0006950 | state | -0.0010076 |
| peace | 0.0006660 | federal | -0.0009782 |
| progress | 0.0006492 | duty | -0.0008222 |
| thought | 0.0006437 | mexican | -0.0008132 |
| militia | 0.0006262 | thus | -0.0008050 |
| session | 0.0006239 | question | -0.0007280 |
| establishment | 0.0006038 | character | -0.0007250 |
| tribes | 0.0005943 | present | -0.0007201 |
| several | 0.0005903 | republic | -0.0007178 |
The remedial policy, the principles and policy of augmenting the military defenses recommended by every branch of the precedent of the President shall exercise his own Government, and that for that mutual good will of those sales during the last session of Congress, the next fiscal year of $404,878.53, or more propriety than the public works, plant schools throughout their Territorial existence, and would foster a system of discriminating and countervailing duties necessarily produces. The selection and of personal communication with California.
While dwelling with pleasing satisfaction upon the general impulse required for the want of an act of December last that instructions had been anticipated as Spain must have known that the expedition having been fully accomplished. The basis of action in public offices is established by those who promoted and facilitated by the laws on the 30th of April 29th, 1816, was the destiny of nations. The question, therefore, whether it should be enabled to judge of the other by partial agreement.
| Top words Republican | Score | Top words Democrat | Score |
|---|---|---|---|
| great | 0.0016912 | upon | -0.0023555 |
| millions | 0.0013367 | people | -0.0022939 |
| nation | 0.0013179 | mexico | -0.0020358 |
| last | 0.0012010 | constitution | -0.0019729 |
| british | 0.0010091 | public | -0.0014277 |
| spain | 0.0008060 | president | -0.0012999 |
| commerce | 0.0008033 | bank | -0.0012531 |
| nations | 0.0008010 | general | -0.0011208 |
| view | 0.0007961 | power | -0.0010619 |
| improvement | 0.0007842 | money | -0.0010279 |
| consideration | 0.0007269 | banks | -0.0010138 |
| course | 0.0006950 | state | -0.0010076 |
| peace | 0.0006660 | federal | -0.0009782 |
| progress | 0.0006492 | duty | -0.0008222 |
| thought | 0.0006437 | mexican | -0.0008132 |
| militia | 0.0006262 | thus | -0.0008050 |
| session | 0.0006239 | question | -0.0007280 |
| establishment | 0.0006038 | character | -0.0007250 |
| tribes | 0.0005943 | present | -0.0007201 |
| several | 0.0005903 | republic | -0.0007178 |
We ought to be a tough economy. I vetoed that proposal to Congress comprehensive legislation that will cover the uninsured, strengthen Medicare for older Americans. Every plan before the Congress to support what works and greater energy independence. We need to ultimately make clean, renewable energy in history, with the people who are behind to catch criminals and drug abuse and heroin abuse. So, who knows, we might perfect our Union. And despite all our children futures to say to those beyond our shores. Right now it helps about half of all children who lose their health care. Forty million Americans without health insurance industry from exploiting patients.
Our country must also act now because it means the most important institutions – a symbol of quality and progress, And where every one who has a new century, your century, on dreams we cannot see, on the offensive by encouraging economic growth, and reforms in education and support the training and launch a major al-Qaida leader in Yemen. All told, more than 3,000 suspected terrorists have chosen the weapon of fear. Some speak of an American tradition to show a certain skepticism toward our democratic institutions.
Lastly, to get an overall summary of United States presidents’ speeches over the course of history, we did a word network analysis on the whole data set, to group the different entities and examine the relationships among them.
For this analysis we construct a word network \(G=(E,V)\) with weights \(w_{ij}\) in the following way:
After constructing the network, we have the following structure:
| Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|
| 1 | 1 | 1 | 1.596 | 1 | 215 |
To find which terms are related among themselves, we used the Louvain Community Detection Algorithm. The entities belonging to the same group are more related to the ones within their group than the ones outside their group. Using the default modularity parameter of 1.0, we found 7 groups.
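To make the grouping criterion concrete: Louvain searches for the partition that maximizes modularity. The sketch below does not implement Louvain itself. It only computes the weighted modularity of a given partition, with a `resolution` argument that we assume plays the role of the parameter of 1.0 mentioned above. The toy graph and all names are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community_of, resolution=1.0):
    """Weighted modularity of a partition of an undirected graph.

    edges: iterable of (node_u, node_v, weight), each edge listed once.
    community_of: dict mapping every node to a community label.
    """
    total_weight = 0.0             # sum of all edge weights
    internal = defaultdict(float)  # edge weight inside each community
    strength = defaultdict(float)  # summed weighted degree per community

    for u, v, w in edges:
        total_weight += w
        strength[community_of[u]] += w
        strength[community_of[v]] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w

    return sum(
        internal[c] / total_weight
        - resolution * (strength[c] / (2 * total_weight)) ** 2
        for c in strength
    )


# Toy graph: two triangles joined by a single bridge edge (2, 3).
toy_edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1),
             (3, 4, 1), (4, 5, 1), (3, 5, 1),
             (2, 3, 1)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

q_split = modularity(toy_edges, two_groups)
q_merged = modularity(toy_edges, one_group)
print(q_split > q_merged)
```

Splitting the toy graph into its two triangles scores higher than lumping every node together, which is the sense in which entities in one group are more related to each other than to the rest.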
The following graph shows the word network obtained:
[entities_network]
The size of each node represents its degree, the size of its label represents its eigenvector centrality, and its color represents its modularity class.
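For readers unfamiliar with these two measures, here is a rough sketch of how they can be computed. It assumes degree means the number of distinct neighbors and that eigenvector centrality is obtained by power iteration on the weighted adjacency matrix. The tool that produced the graph may differ on both points, and the star-shaped toy network and its labels are made up.

```python
from collections import defaultdict
from math import sqrt


def degree_and_eigenvector_centrality(edges, iterations=200, tol=1e-10):
    """Return (degree, centrality) dicts for an undirected weighted graph.

    edges: iterable of (node_u, node_v, weight), each edge listed once.
    """
    neighbors = defaultdict(dict)
    for u, v, w in edges:
        neighbors[u][v] = w
        neighbors[v][u] = w

    # Degree: number of distinct neighbors of each node.
    degree = {node: len(nbrs) for node, nbrs in neighbors.items()}

    # Power iteration on (A + I); the identity shift avoids the
    # oscillation that plain iteration shows on bipartite graphs.
    score = {node: 1.0 for node in neighbors}
    for _ in range(iterations):
        updated = {
            node: score[node] + sum(w * score[m] for m, w in nbrs.items())
            for node, nbrs in neighbors.items()
        }
        norm = sqrt(sum(x * x for x in updated.values()))
        updated = {node: x / norm for node, x in updated.items()}
        converged = max(abs(updated[n] - score[n]) for n in updated) < tol
        score = updated
        if converged:
            break
    return degree, score


# Toy star network: one hub entity linked to three others.
star_edges = [("Congress", "Navy", 1), ("Congress", "Treasury", 1),
              ("Congress", "Army", 1)]
star_degree, star_centrality = degree_and_eigenvector_centrality(star_edges)
print(star_degree["Congress"], max(star_centrality, key=star_centrality.get))
```

In this toy case the hub has both the highest degree and the highest eigenvector centrality. In a larger network the two can diverge, because eigenvector centrality also rewards being connected to well-connected nodes.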
It’s useful to zoom into the graph to analyze particular terms or groupings. For example, we can see interesting cases like the following two:
In the first interesting case, if we focus on the green nodes, we can see that most of them correspond to World War II military terms such as Army Service Forces, German Army, Japanese Fleet, Air Force, etc.